Abstract. A quantitative study of the clustering properties of the cosmic web
as a function of absolute magnitude and colour is presented using the SDSS
Data Release 7 galaxy survey. Mark correlations are included in the analysis.
We compare our results with mock galaxy samples obtained with four different
semi-analytical models of galaxy formation imposed on the merger trees of the
Millennium simulation. The clustering of red and blue galaxies is studied
separately.

1. INTRODUCTION

The 200-year history of the Tartu Observatory is strongly linked with the
exploration of the Earth and space at different scales. In the early 19th century,
the triangulation along the Tartu Meridian Arc, 3000 km across Europe, helped to
determine the size and precise shape of the Earth. The first stellar parallax
measurements (alongside those of Bessel) by Wilhelm Struve, the founder of the
Tartu Observatory, provided the basis for exploring our neighborhood within the
Milky Way. The dynamical distance measurements of the Andromeda nebula and other
island universes by Ernst Öpik in 1918-1922 opened the way to the first systematic
works in the field of extragalactic astronomy.

The study of the large-scale distribution of galaxies became an important
research subject more than 50 years ago with the notion of filamentary structure as
revealed by the Lick galaxy survey (Shane & Wirtanen 1954). The picture of a
cellular structure of the Universe, dominated by filaments and large voids in the
galaxy distribution, was developed during the period 1974-1980 at the cosmology
school of Tartu Observatory (Joeveer, Einasto & Tago 1978; Einasto, Joeveer &
Saar 1980). These results were presented at the IAU Symposium No. 79 in Tallinn
(Longair & Einasto 1978), where an exposition of the pancake theory of large-scale
structure formation was presented by Zel'dovich, Doroshkevich, Shandarin, Sigov
and Kotok (see e.g. Zel'dovich 1978). Already at this time, galaxy formation in
proto-clusters was discussed by Doroshkevich, Saar & Shandarin (1978).

A quantitative description of galaxy clustering was provided for the first
time by Totsuji & Kihara (1969), who established the power-law dependence of the
angular auto-correlation function. However, the true spatial distribution became
apparent only with the advent of the Harvard-Smithsonian Center for Astrophysics
redshift surveys (Huchra et al. 1983; Geller & Huchra 1989). The quantitative
properties of the spatial clustering were provided by Davis & Peebles (1983) and
Efstathiou & Jedrzejewski (1984). Later, more extended surveys confirmed the
power-law behaviour of the correlation function, in particular the Automatic Plate
Measuring survey (Efstathiou 1993), the Las Campanas Redshift Survey (Tucker
et al. 1997), the Two-degree-Field Galaxy Redshift Survey (Madgwick et al. 2003),
and the Sloan Digital Sky Survey (Li et al. 2006; Swanson et al. 2008). In these
and related studies it was shown that the clustering of galaxies strongly depends
on their magnitudes, morphological types, and colours (e.g. Davis & Geller 1976;
Loveday et al. 1995; Zehavi et al. 2010).

We have been involved in a detailed analysis of the cosmic web using both
modern redshift surveys and numerical simulations of galaxy formation together
with colleagues from Tartu. Building on standard techniques such as those used
in Tucker et al. (1997), we analyze here the largest SDSS galaxy redshift catalogue
presently available. We also present an analysis of mark correlation functions. The
aim of this contribution is to investigate the distribution of galaxies and its relation
to the underlying dark matter density field within the standard ΛCDM paradigm.
We perform a correlation analysis depending on the absolute magnitude and colour
of observed galaxies and compare the results with a series of semi-analytical models
of galaxy formation imposed on the Millennium simulation (Springel et al. 2005).

2. DATA AND MOCK SAMPLE SELECTION 

We study the cosmic web using the SDSS Data Release 7, the largest near-field
galaxy redshift survey available. The survey is complete and comprises a
large contiguous region of the Northern Galactic cap covering 7500 deg^2. Photometric
calibration and k-correction to a fixed reference redshift are done according to
Hogg et al. (2002) using the Galactic extinction measurements of Schlegel et al.
(1998). We employ absolute Petrosian (1976) AB magnitudes and use the New York
University Value-Added Galaxy Catalog (Blanton et al. 2005).

Starting from the observed R-band magnitude and redshift distributions, we
define two sets of volume-limited galaxy samples as illustrated in Fig. 1 (see Table
1). The first set of volume-limited samples (m1 to m12) is used to investigate the
dependence of the auto-correlation function on absolute magnitude. The samples
are selected to cover a large magnitude range and to enclose a sufficient
number of galaxies for the analysis. The samples therefore partially overlap; each
sample nevertheless contains a significant number of independent objects from
which to derive the auto-correlation function. The second set (r1, r2, r3) was
selected to cover a large range of magnitudes. This allows us to investigate the
magnitude dependence of clustering using mark correlation functions. We subdivide
the galaxies into red and blue by a least-squares fit through the green valley in
the (U - R, R) plane, which leads to the separation line U - R = 1.8 - 0.05 × (R + 19).
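
The sample definition and colour split described above can be illustrated with a short sketch. It is not the authors' pipeline: the function names, the inclusive boundaries, the choice to count galaxies exactly on the separation line as blue, and the toy catalogue values are all assumptions made for illustration. The only quantities taken from the text are the separation line and the m1 limits of Table 1, whose two magnitude columns are read here as faint and bright limits.

```python
import numpy as np


def volume_limited_mask(abs_mag, z, mag_faint, mag_bright, z_low, z_up):
    """Boolean mask of galaxies inside one magnitude/redshift box.

    Magnitudes are negative, so 'brighter' means numerically smaller.
    Inclusive boundaries are an assumption of this sketch.
    """
    abs_mag = np.asarray(abs_mag, dtype=float)
    z = np.asarray(z, dtype=float)
    in_mag = (abs_mag <= mag_faint) & (abs_mag >= mag_bright)
    in_z = (z >= z_low) & (z <= z_up)
    return in_mag & in_z


def is_red(u_minus_r, abs_mag_r):
    """True where a galaxy lies redward of U - R = 1.8 - 0.05 * (R + 19)."""
    u_minus_r = np.asarray(u_minus_r, dtype=float)
    abs_mag_r = np.asarray(abs_mag_r, dtype=float)
    return u_minus_r > 1.8 - 0.05 * (abs_mag_r + 19.0)


# Toy catalogue (invented values), cut with the m1 limits of Table 1.
abs_mag = np.array([-19.0, -18.0, -19.5, -21.0])
z = np.array([0.03, 0.03, 0.06, 0.04])
colour = np.array([2.0, 1.5, 1.9, 2.2])

in_m1 = volume_limited_mask(abs_mag, z, -18.35, -19.86, 0.020, 0.056)
red = is_red(colour, abs_mag)
```

Only the first toy galaxy falls inside the m1 box; the others are too faint, too distant, or too bright. The colour threshold shifts redward for brighter galaxies because of the -0.05 slope.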

Table 1. Properties of the SDSS volume-limited samples. The correlation length, r_0,
of the different samples is given for samples m1 - m12 (for blue galaxies only m1 - m7).

Sample  M_faint  M_bright  z_low  z_up   Number  Red     Blue   r_0(all)  r_0(red)  r_0(blue)
m1      -18.35   -19.86    0.020  0.056  42165   17801   24364  6.33      8.72      4.58
m2      -19.08   -20.43    0.026  0.078  86272   45531   40741  6.45      7.83      4.81
m3      -19.73   -20.94    0.032  0.105  129802  79097   50705  7.29      8.35      5.36
m4      -20.28   -21.40    0.040  0.136  161913  107837  54076  7.49      8.26      5.65
m5      -20.76   -21.82    0.049  0.169  161392  114573  46819  8.22      8.85      6.16
m6      -21.16   -22.20    0.058  0.20   172264  94975   32289  8.94      9.74      6.99
m7      -21.49   -22.54    0.068  0.20   69787   55468   14419  9.07      9.60      7.70
m8      -21.77   -22.86    0.078  0.20   32677   27432   5245   10.11     10.50     -
m9      -21.98   -23.16    0.090  0.20   15545   13597   1948   11.40     11.79     -
m10     -22.15   -23.43    0.102  0.20   8343    7483    860    12.07     12.45     -
m11     -22.26   -23.70    0.116  0.20   5077    4614    463    12.81     13.05     -
m12     -22.36   -23.96    0.130  0.20   3120    2856    264    13.29     13.70     -
r1      -18.51   -20.77    0.03   0.06   63546   31464   32082  -         -         -
r2      -19.39   -22.28    0.06   0.09   125491  76733   48758  -         -         -
r3      -20.01   -23.16    0.09   0.12   114266  74612   39654  -         -         -

For comparison we use four sets of mock galaxy samples constructed using
the Millennium simulation. It follows the evolution of dark matter haloes and
sub-haloes using 2160^3 particles in a large box of 500 Mpc on a side.
Galaxy catalogues are modelled using semi-analytical models of galaxy formation
applied to the merger trees of haloes in the simulation. The model of Croton et al.
(2006, hereafter C06) implements AGN feedback in two channels ('quasar' and
'radio' modes) to efficiently suppress star formation in high-mass haloes, thereby
forming a realistic population of elliptical galaxies. The model of De Lucia &
Blaizot (2007, hereafter D07) builds on the first model and improves the treatment
of satellite mergers, using a more realistic dust model and a different initial mass
function for the stellar population synthesis. The third catalogue of mock galaxies,
produced by Font et al. (2008, hereafter F08), includes a modelling of ram pressure
stripping of satellite galaxies by hot gas inside large dark matter haloes. In this
way, the luminosity function of faint red galaxies is better reproduced. Finally,
the model of Guo et al. (2011, hereafter G11) improves the treatment of the cooling
flow regime and the rapid gas inflow, and it updates some parameters related to
star formation and feedback processes. The mock galaxy samples are constructed by
applying the same angular selection as in the observations as well as the magnitude
and redshift ranges provided in Table 1.

3. CORRELATION ANALYSIS 

The correlation functions are evaluated using the Landy & Szalay (1993)
estimator. Data-data, data-random and random-random pairs are generated with the
same angular selection function as the observations and the redshift bounds given in
Table 1, without taking into account the fiber separation limit of the SDSS.
The estimator reads

$$\xi(r) = \frac{\langle DD(r) - 2\,DR(r) + RR(r)\rangle}{\langle RR(r)\rangle}$$

Errors are estimated using 10 bootstrap resamplings of the data. Fig. 2 shows the
convex form of the correlation function over the range from 0.2 to 50 Mpc. The
solid line in the left panel shows the result for all galaxies in sample m1.
Additionally, a power-law fit at the correlation length scale, i.e. where
$\xi(r) = 1$, is shown. The dashed line corresponds to red galaxies and lies about
0.2 dex above that of the full galaxy sample; the dot-dashed line corresponds to
blue galaxies and lies about 0.15 dex below. The slope of the power law is about
$\gamma = 1.4$ for all samples. For the remaining datasets we get similar results;
however, the difference in clustering strength between red and blue galaxies gets
smaller for brighter samples.

Fig. 1. Magnitude and redshift boundaries of the 12 (left panel) and 3 (right panel)
volume-limited galaxy samples for a large coverage in depth (m samples) and magnitude
(r samples), respectively.
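
As a rough illustration of the Landy & Szalay estimator with bootstrap errors, the following sketch may help. It is an assumption-laden toy, not the code used for the analysis: the function names are hypothetical, pairs are counted by brute force (memory grows with the square of the number of points, so it only suits small samples), positions are Cartesian, and no survey mask or fiber-collision treatment is included. Bootstrap resampling with replacement produces duplicate points; their zero separations fall outside the bins as long as the first bin edge is positive.

```python
import numpy as np


def pair_counts(a, b, edges, auto=False):
    """Histogram of pair separations between point sets a and b."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    if auto:
        d = d[np.triu_indices(len(a), k=1)]  # count each unordered pair once
    return np.histogram(d.ravel(), bins=edges)[0].astype(float)


def landy_szalay(data, rand, edges):
    """xi = (DD - 2 DR + RR) / RR with pair counts normalised per pair."""
    data = np.asarray(data, dtype=float)
    rand = np.asarray(rand, dtype=float)
    nd, nr = len(data), len(rand)
    dd = pair_counts(data, data, edges, auto=True) / (nd * (nd - 1) / 2.0)
    rr = pair_counts(rand, rand, edges, auto=True) / (nr * (nr - 1) / 2.0)
    dr = pair_counts(data, rand, edges) / (nd * nr)
    return (dd - 2.0 * dr + rr) / rr


def bootstrap_errors(data, rand, edges, n_boot=10, seed=0):
    """Standard deviation of xi over bootstrap resamplings of the data."""
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    samples = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(data), size=len(data))  # with replacement
        samples.append(landy_szalay(data[idx], rand, edges))
    return np.std(samples, axis=0, ddof=1)
```

For an unclustered data set drawn from the same distribution as the random catalogue, the estimator should scatter around zero. Production analyses usually replace the brute-force distance matrix with a tree or grid based pair counter.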

The right panel of Fig. 2 shows the ratio of the correlation function of each of
the four mock catalogues to that of the SDSS sample m4. For clarity, error bars are
only given for the upper and lower curves. The correlation functions of models C06
(solid line) and G11 (dot-dashed line) reproduce the shape of the observed
correlation function over almost all spatial scales. However, the clustering
amplitude is underpredicted by about 20 percent. Acceptable results are also
obtained for the model D07, while F08 overpredicts the clustering of close pairs by
up to a factor of two. The correlation functions of the other samples behave in a
similar way.

Fig. 2. Left: Two-point auto-correlation function for sample m1 with all galaxies
(solid line), red galaxies (dashed line) and blue galaxies (dot-dashed line). For the full
sample, a power-law fit, $\xi = (r/r_0)^{-\gamma}$, centered at the correlation length
scale, $r_0$, is shown. Error bars for the full sample are partly smaller than the line
thickness. Right: Ratio between model correlation functions and SDSS galaxies for the
m4 sample. The different semi-analytic mock samples considered are those of C06 (solid
line), D07 (dotted line with error bars), F08 (dashed line with error bars), and G11
(dash-dotted line).

The results can be described in a compact form by evaluating the change of the
correlation length as a function of absolute magnitude. The left panel of Fig. 3
shows the correlation length for the mean absolute R-magnitudes of samples m1
to m12. The solid, dashed and dot-dashed lines correspond to all, red, and blue
galaxies, respectively. The correlation length difference between red and blue
galaxies decreases from about 4 Mpc at R = -18.4 to 2 Mpc at R = -21.5.
As seen in the figure, the brighter samples are dominated by red galaxies. The
right panel shows the results corresponding to the G11 model. The correlation
lengths of all and blue galaxies stay nearly constant between R = -18.4 and
R = -21, while the correlation length of red galaxies decreases. This is due to the
large number of satellites present among faint galaxies (cf. also Weinmann et al.
2006), which tend to cluster more strongly than field red galaxies with R = -21. At
brighter magnitudes the correlation length increases due to the higher bias of more
massive haloes with respect to the underlying mass distribution. The remaining
semi-analytical models display similar trends.
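
The correlation length used here is the separation at which the correlation function equals one. One simple way to read it off a binned measurement is sketched below, under stated assumptions: the function names and the fitting window of 0.3 dex are hypothetical choices, the correlation function is assumed positive and decreasing through unity, and the interpolation is linear in log-log space. The source does not specify how the fit around the correlation length was carried out.

```python
import numpy as np


def correlation_length(r, xi):
    """Separation where xi crosses 1, interpolated linearly in log-log space."""
    r = np.asarray(r, dtype=float)
    xi = np.asarray(xi, dtype=float)
    log_r, log_xi = np.log10(r), np.log10(xi)
    # First pair of bins bracketing xi = 1, assuming xi decreases with r.
    for i in range(len(r) - 1):
        if log_xi[i] >= 0.0 > log_xi[i + 1]:
            t = log_xi[i] / (log_xi[i] - log_xi[i + 1])
            return 10.0 ** (log_r[i] + t * (log_r[i + 1] - log_r[i]))
    raise ValueError("xi does not cross 1 in the sampled range")


def local_slope(r, xi, r0, width=0.3):
    """Power-law index gamma from a line fit of log xi against log r
    within +/- width dex of r0 (the window size is an assumption)."""
    r = np.asarray(r, dtype=float)
    xi = np.asarray(xi, dtype=float)
    sel = np.abs(np.log10(r / r0)) <= width
    slope, _ = np.polyfit(np.log10(r[sel]), np.log10(xi[sel]), 1)
    return -slope
```

Applied to an exact power law, for example one built with the m1 value of 6.33 from Table 1 and an index of 1.4, both functions recover the input parameters. For measured correlation functions the result depends on the binning and on the chosen window.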

The ratio between the correlation lengths of the semi-analytical models considered
here and those observed for red and blue galaxies can be seen in Fig. 4 (left and
right panels, respectively). In general, most models reproduce the clustering
amplitude of galaxies, as measured by the correlation length, to within about
20 percent. However, there is a general trend for bright blue galaxies to be too
weakly clustered. This is probably because star formation in massive haloes is too
efficient, so that their galaxies appear too bright for a given clustering strength.
The trend shown by red galaxies is in principle similar. A remarkable exception can
be seen in the faintest magnitude bin, due to the efficient feedback implemented in
the models. The other important exception is the increase observed for R < -21 in
model C06, which is due to the strong quasar feedback that causes bright red
galaxies to be hosted by too massive and, therefore, too strongly clustered haloes.



4. MARK CORRELATION FUNCTION 

The trends already discussed for the clustering amplitude of galaxies using
the standard two-point correlation function can be further investigated by means
of the mark correlation function (e.g. Beisbart, Kerscher & Mecke 2002). This
statistic is defined as the average of an intrinsic galaxy property m (here the
colour index U - R or the R magnitude) over galaxy pairs as a function of separation
r, and can be written as ($\langle m\rangle$ is the average of the mark over the
whole sample)

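$$k_m(r = |\mathbf{r}_1 - \mathbf{r}_2|) = \frac{\langle m(\mathbf{r}_1) + m(\mathbf{r}_2)\rangle}{2\,\langle m\rangle}.$$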
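
A minimal sketch of such a statistic is given below. It assumes the normalised mean-mark form, in which the average of the two marks of each pair at separation r is divided by the sample mean of the mark, so that a value of one means no dependence of the mark on separation. This normalisation, the function name and the brute-force pair loop are assumptions of the sketch, not a statement of the implementation used for the figures.

```python
import numpy as np


def mark_correlation(pos, marks, edges):
    """Mean pair mark in separation bins, divided by the sample mean mark.

    Returns k_m = <m_1 + m_2>(r) / (2 <m>) per bin; empty bins give nan.
    """
    pos = np.asarray(pos, dtype=float)
    marks = np.asarray(marks, dtype=float)
    i, j = np.triu_indices(len(pos), k=1)          # each unordered pair once
    sep = np.linalg.norm(pos[i] - pos[j], axis=1)
    n_pairs = np.histogram(sep, bins=edges)[0]
    mark_sum = np.histogram(sep, bins=edges, weights=marks[i] + marks[j])[0]
    with np.errstate(invalid="ignore", divide="ignore"):
        return mark_sum / (2.0 * marks.mean() * n_pairs)


# Three toy points: the close pair carries the large marks.
toy_pos = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
toy_marks = np.array([3.0, 3.0, 0.0])
toy_k = mark_correlation(toy_pos, toy_marks, np.array([0.5, 2.0, 20.0]))
```

In the toy example the close pair has marks above the mean, so the first bin lies above one and the second below it. With identical marks for all objects the statistic equals one in every populated bin.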

The left panel of Fig. 5 shows the mark correlation function of the samples m1
and m6 (solid and dashed lines, respectively) compared to the corresponding mock
samples for model F08 (dotted lines) using U - R colours as a mark. Interestingly,
there is a significant signal out to a distance of about 10 Mpc, where the
samples show redder U - R colours than the average. On the smaller scales this
enhancement is about 0.05 to 0.1 mag. The excess of red neighbours is the result
of the morphological transformation of galaxies by direct and tidal interactions.
Since this effect is much stronger for faint galaxies, it is natural to find a higher
signal for sample m1. Below 1 Mpc the mock galaxies show too strong a mark
correlation, which suggests that the suppression of star formation in close galaxy
pairs is overestimated in the models. The same behaviour is seen for the other
mock samples.

Fig. 3. Correlation length as a function of mean R-magnitude. Left: SDSS samples
from m1 to m12 for all galaxies (stars and solid line), red galaxies (open squares and
dashed lines), and blue galaxies (open diamonds and dash-dotted line). Right: same
as left panel but for mock samples in the G11 model. Error bars represent 2 standard
deviations; they are smaller than the symbols.

Fig. 4. Ratio of the correlation length of mock and SDSS data as a function of mean
R-magnitude. Left: m1 to m12 samples for red galaxies. Right: m1 to m7 samples
for blue galaxies. The different semi-analytic mock samples considered are those of C06
(asterisks and solid line), D07 (asterisks and dotted lines), F08 (open diamonds and
dashed line) and G11 (open squares and dot-dashed line). Errors are again 2 standard
deviations.

Fig. 5. Left panel: Mark correlation function using the U - R colour as mark for
samples m1 (solid line) and m6 (dashed line) in comparison with mock samples given
by the F08 model (dotted lines). Right: Mark correlation function using the U-band
($k_U < 1$) and R-band ($k_R > 1$) absolute magnitudes as mark for samples r1 (solid
line) and r3 (dashed line) in comparison with mock samples given by the F08 model
(dotted lines). For clarity, mock galaxies here are only compared with sample r1.
Errors are one standard deviation.

As can be seen in the right panel of Fig. 5, the resulting signals are much weaker
when absolute magnitudes are used as a mark. The correlations for the samples
r1 and r3 are shown as solid and dashed lines, while curves below and above
$k_{U,R} = 1$ correspond to the U- and R-bands, respectively. This means that close
pairs with a separation up to 10 Mpc are brighter in the R band and dimmer in
the U band by less than 0.005 mag. Although the effect is weak, the
result is significant, as the corresponding error bars show. In this case errors are
estimated using 100 samples with randomly reshuffled marks.
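
The reshuffling test can be sketched as follows: the marks are randomly permuted among the galaxies, which destroys any link between mark and position while keeping the positions and the mark distribution fixed, and the scatter of the resulting statistic gives the error under the null hypothesis. The function name, the normalised mean-mark form of the statistic and the brute-force pair search are assumptions made for this illustration.

```python
import numpy as np


def reshuffled_mark_scatter(pos, marks, edges, n_shuffle=100, seed=0):
    """Mean and standard deviation of k_m over random mark permutations."""
    rng = np.random.default_rng(seed)
    pos = np.asarray(pos, dtype=float)
    marks = np.asarray(marks, dtype=float)
    # Pair separations depend only on positions, so compute them once.
    i, j = np.triu_indices(len(pos), k=1)
    sep = np.linalg.norm(pos[i] - pos[j], axis=1)
    n_pairs = np.histogram(sep, bins=edges)[0]
    k_null = np.empty((n_shuffle, len(edges) - 1))
    for s in range(n_shuffle):
        m = rng.permutation(marks)                 # break mark-position link
        mark_sum = np.histogram(sep, bins=edges, weights=m[i] + m[j])[0]
        k_null[s] = mark_sum / (2.0 * m.mean() * n_pairs)
    return k_null.mean(axis=0), k_null.std(axis=0, ddof=1)
```

A measured signal can then be compared bin by bin with the returned scatter. The bins are assumed to be populated; an empty bin would yield nan.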

5. DISCUSSION

The clustering of SDSS galaxies was previously discussed by Zehavi et al.
(2010), mainly using the angular correlation function. Although this approach
has the advantage of being independent of redshift-space distortions, it uses only
part of the information encoded in the galaxy distribution. However, results
concerning the colour and magnitude dependence of clustering are similar to ours.
Interestingly, the clustering of faint galaxies with R > -21 is only weakly
dependent on magnitude. In contrast, brighter galaxies are increasingly strongly
clustered, as clearly seen from the luminosity dependence of the correlation function.

We compared the clustering of SDSS galaxies with a large set of model galaxy
samples based on the merger trees of the Millennium simulation that assume
different semi-analytical prescriptions for galaxy formation. These different
theoretical models are able to qualitatively reproduce the clustering dependence
as a function of magnitude and colour. Quantitatively, however, significant
differences remain, with the F08 model showing the smallest discrepancies on
scales above 1 Mpc.

In addition to the standard two-point correlation technique, we carried out
a new analysis using mark correlation functions, which are suitable to assess the
strength of galaxy transformations linked to their formation process. Surprisingly,
we found a significant signal for galaxy pairs with a separation up to 10 Mpc
depending on colour and, to a weaker extent, on absolute magnitude.

It is our plan to continue the study of the properties of the galaxy distribution
and its connection with the large-scale density field using mark correlation
techniques. To characterize the density field we combine cosmological simulations with
a galaxy group catalog to get the positions of suspected dark matter haloes. By
extrapolating the mass density into the zones of influence of each halo we estimate
the fine-scale density field that reproduces both the observed large-scale galaxy
distribution and the average density profile around each group (Muñoz, Müller
& Forero-Romero 2011). This approach will therefore allow us to investigate further
the relation between galaxy properties and their environmental density, with the
aim of improving our knowledge of the cosmic web.